NSF PAR Search | NSF Public Access Repository

Adaptive Degree-Based Conformal Prediction for Individual Treatment Effect Estimation on Networked Data

https://doi.org/10.1109/BigData66926.2025.11402563

Zhou, Xiaofan; Ma, Jing; Cheng, Lu (December 2025, IEEE)

Full Text Available
Causal Effect Estimation with Mixed Latent Confounders and Post-treatment Variables

Zhu, Yaochen; Ma, Jing; Wu, Liang; Guo, Qi; Hong, Liangjie; Li, Jundong (April 2025, International Conference on Learning Representations)

Causal inference from observational data has attracted considerable attention among researchers. One main obstacle is the handling of confounders. As direct measurement of confounders may not be feasible, recent methods seek to address the confounding bias via proxy variables, i.e., covariates postulated to be conducive to the inference of latent confounders. However, the selected proxies may scramble both confounders and post-treatment variables in practice, which risks biasing the estimation by controlling for variables affected by the treatment. In this paper, we systematically investigate the bias due to latent post-treatment variables, i.e., latent post-treatment bias, in causal effect estimation. Specifically, we first derive the bias when selected proxies scramble both latent confounders and post-treatment variables, which we demonstrate can be arbitrarily bad. We then propose a Confounder-identifiable VAE (CiVAE) to address the bias. Based on a mild assumption that the prior of latent variables that generate the proxy belongs to a general exponential family with at least one invertible sufficient statistic in the factorized part, CiVAE individually identifies latent confounders and latent post-treatment variables up to bijective transformations. We then prove that with individual identification, the intractable disentanglement problem of latent confounders and post-treatment variables can be transformed into a tractable independence test problem, even though arbitrary dependence may exist among them. Finally, we prove that the true causal effects can be estimated without bias using the transformed confounders inferred by CiVAE. Experiments on both simulated and real-world datasets demonstrate significantly improved robustness of CiVAE.
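The bias from controlling for a variable affected by the treatment can be illustrated with a small simulation. The sketch below is not CiVAE and is not taken from the paper: it assumes fully observed variables, a linear outcome model, and made-up coefficients, and all function names are hypothetical. It only shows that adjusting for a post-treatment variable shifts the estimated effect away from the total effect.

```python
import numpy as np

def simulate(n=200_000, seed=0):
    """Toy data (assumed model): c confounds t and y; m is caused by t."""
    rng = np.random.default_rng(seed)
    c = rng.normal(size=n)                                # confounder
    t = (c + rng.normal(size=n) > 0).astype(float)        # treatment depends on c
    m = 1.0 * t + rng.normal(size=n)                      # post-treatment variable
    y = 2.0 * t + 1.5 * c + 1.0 * m + rng.normal(size=n)  # total effect of t is 2 + 1 = 3
    return c, t, m, y

def treatment_coefficient(t, y, controls):
    """OLS coefficient on t when regressing y on an intercept, t, and the controls."""
    X = np.column_stack([np.ones_like(t), t] + controls)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

c, t, m, y = simulate()
naive = treatment_coefficient(t, y, [])          # confounded: overestimates the effect
adjusted = treatment_coefficient(t, y, [c])      # close to the total effect, 3
over_adjusted = treatment_coefficient(t, y, [c, m])  # close to 2: the path through m is blocked

print(f"no controls:        {naive:.2f}")
print(f"control for c:      {adjusted:.2f}")
print(f"control for c and m: {over_adjusted:.2f}")
```

In this toy setting the confounder and the post-treatment variable are observed separately, so the correct adjustment set is obvious; the abstract concerns the harder case where proxies mix the two and they must first be identified.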
Full Text Available
Invariant shape representation learning for image classification

Hossain, Tonmoy; Ma, Jing; Li, Jundong; Zhang, Miaomiao (February 2025, IEEE)

Full Text Available
Fairness-Aware Graph Learning: A Benchmark

https://doi.org/10.1145/3711896.3737392

Dong, Yushun; Wang, Song; Lei, Zhenyu; Zheng, Zaiyi; Ma, Jing; Chen, Chen; Li, Jundong (August 2025, ACM)

Full Text Available
A review on knowledge graphs for healthcare: Resources, applications, and promises

https://doi.org/10.1016/j.jbi.2025.104861

Cui, Hejie; Lu, Jiaying; Xu, Ran; Wang, Shiyu; Ma, Wenjing; Yu, Yue; Yu, Shaojun; Kan, Xuan; Ling, Chen; Zhao, Liang; et al (September 2025, Journal of Biomedical Informatics)

Full Text Available
Global Graph Counterfactual Explanation: A Subgraph Mapping Approach

He, Yinhan; Zheng, Wendy; Zhu, Yaochen; Ma, Jing; Mishra, Saumitra; Raman, Natraj; Liu, Ninghao; Li, Jundong (March 2025, Transactions on Machine Learning Research)

Graph Neural Networks (GNNs) have been widely deployed in various real-world applications. However, most GNNs are black-box models that lack explanations. One strategy to explain GNNs is through counterfactual explanation, which aims to find minimum perturbations on input graphs that change the GNN predictions. Existing works on GNN counterfactual explanations primarily concentrate on the local-level perspective (i.e., generating counterfactuals for each individual graph), which suffers from information overload and lacks insights into the broader cross-graph relationships. To address such issues, we propose GlobalGCE, a novel global-level graph counterfactual explanation method. GlobalGCE aims to identify a collection of subgraph mapping rules as counterfactual explanations for the target GNN. According to these rules, substituting certain significant subgraphs with their counterfactual subgraphs will change the GNN prediction to the desired class for most graphs (i.e., maximum coverage). Methodologically, we design a significant subgraph generator and a counterfactual subgraph autoencoder in our GlobalGCE, where the subgraphs and the rules can be effectively generated. Extensive experiments demonstrate the superiority of our GlobalGCE compared to existing baselines.
Full Text Available
Causal Inference with Latent Variables: Recent Advances and Future Prospectives

https://doi.org/10.1145/3637528.3671450

Zhu, Yaochen; He, Yinhan; Ma, Jing; Hu, Mengxuan; Li, Sheng; Li, Jundong (August 2024, Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining)

Causality lays the foundation for the trajectory of our world. Causal inference (CI), which aims to infer intrinsic causal relations among variables of interest, has emerged as a crucial research topic. Nevertheless, the lack of observation of important variables (e.g., confounders, mediators, exogenous variables, etc.) severely compromises the reliability of CI methods. The issue may arise from the inherent difficulty in measuring the variables. Additionally, in observational studies where variables are passively recorded, certain covariates might be inadvertently omitted by the experimenter. Depending on the type of unobserved variables and the specific CI task, various consequences can be incurred if these latent variables are carelessly handled, such as biased estimation of causal effects, incomplete understanding of causal mechanisms, lack of individual-level causal consideration, etc. In this survey, we provide a comprehensive review of recent developments in CI with latent variables. We start by discussing traditional CI techniques when variables of interest are assumed to be fully observed. Afterward, under the taxonomy of circumvention and inference-based methods, we provide an in-depth discussion of various CI strategies to handle latent variables, covering the tasks of causal effect estimation, mediation analysis, counterfactual reasoning, and causal discovery. Furthermore, we generalize the discussion to graph data where interference among units may exist. Finally, we offer fresh aspects for further advancement of CI with latent variables, especially new opportunities in the era of large language models (LLMs).
Full Text Available
DPAR: Decoupled Graph Neural Networks with Node-Level Differential Privacy

https://doi.org/10.1145/3589334.3645531

Zhang, Qiuchen; Lee, Hong kyu; Ma, Jing; Lou, Jian; Yang, Carl; Xiong, Li (May 2024, International World Wide Web Conference (WWW))

Full Text Available
PyGDebias: A Python Library for Debiasing in Graph Learning

https://doi.org/10.1145/3589335.3651239

Dong, Yushun; Lei, Zhenyu; Zheng, Zaiyi; Wang, Song; Ma, Jing; Huang, Alex Jing; Chen, Chen; Li, Jundong (May 2024, Companion Proceedings of the ACM on Web Conference 2024)

Graph-structured data is ubiquitous among a plethora of real-world applications. However, as graph learning algorithms have been increasingly deployed to help decision-making, there has been rising societal concern about the bias these algorithms may exhibit. In certain high-stakes decision-making scenarios, the decisions made may be life-changing for the involved individuals. Accordingly, abundant explorations have been made to mitigate bias in graph learning algorithms in recent years. However, there is still no library that consolidates existing debiasing techniques and helps practitioners easily perform bias mitigation for graph learning algorithms. In this paper, we present PyGDebias, an open-source Python library for bias mitigation in graph learning algorithms. As the first comprehensive library of its kind, PyGDebias covers 13 popular debiasing methods under common fairness notions together with 26 commonly used graph datasets. In addition, PyGDebias comes with comprehensive performance benchmarks and well-documented API designs for both researchers and practitioners. To foster convenient accessibility, PyGDebias is released under a permissive BSD license together with performance benchmarks, API documentation, and usage examples at https://github.com/yushundong/PyGDebias.
Full Text Available
Fair Few-Shot Learning with Auxiliary Sets

https://doi.org/10.3233/FAIA230556

Wang, Song; Ma, Jing; Cheng, Lu; Li, Jundong (September 2023, 26th European Conference on Artificial Intelligence)

Recently, there has been a growing interest in developing machine learning (ML) models that can promote fairness, i.e., eliminating biased predictions towards certain populations (e.g., individuals from a specific demographic group). Most existing works learn such models based on well-designed fairness constraints in optimization. Nevertheless, in many practical ML tasks, only very few labeled data samples can be collected, which can lead to inferior fairness performance. This is because existing fairness constraints are designed to restrict the prediction disparity among different sensitive groups, but with few samples, it becomes difficult to accurately measure the disparity, which renders fairness optimization ineffective. In this paper, we define the fairness-aware learning task with limited training samples as the fair few-shot learning problem. To deal with this problem, we devise a novel framework that accumulates fairness-aware knowledge across different meta-training tasks and then generalizes the learned knowledge to meta-test tasks. To compensate for insufficient training samples, we propose an essential strategy to select and leverage an auxiliary set for each meta-test task. These auxiliary sets contain several labeled training samples that can enhance the model performance regarding fairness in meta-test tasks, thereby allowing for the transfer of learned useful fairness-oriented knowledge to meta-test tasks. Furthermore, we conduct extensive experiments on three real-world datasets to validate the superiority of our framework against the state-of-the-art baselines.
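The claim that prediction disparity is hard to measure from few samples can be checked numerically. The sketch below is an illustration only and is not the framework from the paper: it assumes a demographic-parity-style gap as the disparity measure, two groups with made-up positive-prediction rates of 0.5 and 0.6, and hypothetical function names. It compares how much the measured gap fluctuates across repeated draws with 5 versus 500 samples per group.

```python
import numpy as np

def demographic_parity_gap(pred, group):
    """Absolute difference in positive-prediction rates between group 1 and group 0."""
    return abs(pred[group == 1].mean() - pred[group == 0].mean())

def gap_spread(n_per_group, trials=2000, seed=0):
    """Standard deviation of the measured gap over repeated random samples."""
    rng = np.random.default_rng(seed)
    group = np.concatenate([np.zeros(n_per_group), np.ones(n_per_group)])
    gaps = []
    for _ in range(trials):
        pred0 = rng.random(n_per_group) < 0.5  # assumed positive rate for group 0
        pred1 = rng.random(n_per_group) < 0.6  # assumed positive rate for group 1
        pred = np.concatenate([pred0, pred1]).astype(float)
        gaps.append(demographic_parity_gap(pred, group))
    return float(np.std(gaps))

few = gap_spread(5)     # few-shot regime: 5 samples per group
many = gap_spread(500)  # data-rich regime: 500 samples per group

print(f"spread of measured gap with 5 per group:   {few:.3f}")
print(f"spread of measured gap with 500 per group: {many:.3f}")
```

With so few samples, the spread of the measured gap is larger than the assumed true gap of 0.1, so a fairness constraint built on that measurement is driven mostly by sampling noise.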
Full Text Available